A Parameterized Approach to Integrating Aspect with Lexical-Semantics for Machine Translation
Abstract
This paper discusses how a two-level knowledge representation model for machine translation integrates aspectual information with lexical-semantic information by means of parameterization. The integration of aspect with lexical-semantics is especially critical in machine translation because of the lexical selection and aspectual realization processes that operate during the production of the target-language sentence: there are often a large number of lexical and aspectual possibilities to choose from in the production of a sentence from a lexical-semantic representation. Aspectual information from the source-language sentence constrains the choice of target-language terms. In turn, the target-language terms limit the possibilities for generation of aspect. Thus, there is a two-way communication channel between the two processes. This paper will show that the selection/realization processes may be parameterized so that they operate uniformly across more than one language, and it will describe how the parameter-based approach is currently being used as the basis for extraction of aspectual information from corpora.

INTRODUCTION

This paper discusses how the two-level knowledge representation model for machine translation presented by Dorr (1991) integrates aspectual information with lexical-semantic information by means of parameterization. The parameter-based approach borrows certain ideas from previous work such as the lexical-semantic model of Jackendoff (1983, 1990) and models of aspectual representation including Bach (1986), Comrie (1976), Dowty (1979), Mourelatos (1981), Passonneau (1988), Pustejovsky (1988, 1989, 1991), and Vendler (1967). However, unlike previous work, the current approach examines aspectual considerations within the context of machine translation. More recently, Bennett et al. (1990) have examined aspect and verb semantics within the context of machine translation in the spirit of Moens and Steedman (1988). This paper borrows from, and extends, these ideas by demonstrating how this theoretical framework might be adapted for crosslinguistic applicability. The framework has been tested within the context of an interlingual machine translation system and is currently being used as the basis for extraction of aspectual information from corpora.

* This paper describes research done in the Institute for Advanced Computer Studies at the University of Maryland. A special thanks goes to Terry Gaasterland and Ki Lee for helping to close the gap between properties of aspectual information and properties of lexical-semantic structure. In addition, useful guidance and commentary during this research were provided by Bruce Dawson, Michael Herweg, Jorge Lobo, Paola Merlo, Norbert Hornstein, Patrick Saint-Dizier, Clare Voss, and Amy Weinberg.

(1) Syntactic:
    (a) Null Subject divergence:
        E: I have seen Mary ⇔ S: He visto a María (Have seen (to) Mary)
    (b) Constituent Order divergence:
        E: I have seen Mary ⇔ G: Ich habe Marie gesehen (I have Mary seen)
(2) Lexical-Semantic:
    (a) Thematic divergence:
        E: I like Mary ⇔ S: María me gusta a mí (Mary pleases me)
    (b) Structural divergence:
        E: John entered the house ⇔ S: Juan entró en la casa (John entered in the house)
    (c) Categorial divergence:
        S: Yo tengo hambre ⇔ G: Ich habe Hunger (I have hunger)
(3) Aspectual:
    (a) Iterative divergence:
        E: John stabbed Mary ⇔ S: Juan le dio una puñalada a María (John gave a knife-wound to Mary)
                               S: Juan le dio puñaladas a María (John gave knife-wounds to Mary)
    (b) Durative divergence:
        E: John met/knew Mary ⇔ S: Juan conoció a María (John met Mary)
                                S: Juan conocía a María (John knew Mary)

Figure 1: Three Levels of MT Divergences
The integration of aspect with lexical-semantics is especially critical in machine translation because of the lexical selection and aspectual realization processes that operate during the production of the target-language sentence: there are often a large number of lexical and aspectual possibilities to choose from in the production of a sentence from a lexical-semantic representation. Aspectual information from the source-language sentence constrains the choice of target-language terms. In turn, the target-language terms limit the possibilities for generation of aspect. Thus, there is a two-way communication channel between the two processes. Figure 1 shows some of the types of parametric divergences (Dorr, 1990a) that can arise cross-linguistically.
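The two-way constraint described above can be illustrated with a small sketch. This is not the paper's system: the feature names (`+iterative`, `+durative`), the concept labels, the `Realization` type, and the `realize` function are all hypothetical, and the toy lexicon holds only the Spanish aspectual examples from Figure 1. The lexicon is passed in as a parameter, which is one simple way to let the same procedure run over more than one language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Realization:
    """One candidate target-language sentence (hypothetical type)."""
    surface: str          # target-language string
    features: frozenset   # aspectual features this form commits to


# Toy lexicon built from the Figure 1 examples; the concept labels and
# feature names are illustrative assumptions, not the paper's notation.
SPANISH = {
    "STAB": [
        Realization("Juan le dio una puñalada a María", frozenset({"-iterative"})),
        Realization("Juan le dio puñaladas a María", frozenset({"+iterative"})),
    ],
    "KNOW": [
        Realization("Juan conoció a María", frozenset({"-durative"})),
        Realization("Juan conocía a María", frozenset({"+durative"})),
    ],
}


def realize(concept, source_aspect, lexicon):
    """Return the candidates whose features do not contradict source_aspect.

    source_aspect is the set of aspectual features recovered from the
    source sentence; it may be empty when the source is underspecified.
    """
    def contradicts(feature):
        # "+x" contradicts "-x" and vice versa.
        flipped = ("-" if feature[0] == "+" else "+") + feature[1:]
        return flipped in source_aspect

    return [
        r for r in lexicon[concept]
        if not any(contradicts(f) for f in r.features)
    ]
```

With a durative source reading, `realize("KNOW", {"+durative"}, SPANISH)` keeps only the conocía sentence, so the source aspect has constrained the choice of terms. With no iterativity information, `realize("STAB", set(), SPANISH)` keeps both candidates, and whichever one is chosen then fixes the aspect of the output through its `features`.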
Similar resources
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Lexical Chains meet Word Embeddings in Document-level Statistical Machine Translation
The phrase-based Statistical Machine Translation (SMT) approach deals with sentences in isolation, making it difficult to consider discourse context in translation. This poses a challenge for ambiguous words that need discourse knowledge to be correctly translated. We propose a method that benefits from the semantic similarity in lexical chains to improve SMT output by integrating it in a docum...
Examining the Effect of Ideology and Idiosyncrasy on Lexical Choices in Translation Studies within the CDA Framework
Using a critical discourse analytic model of translation criticism, the present study attempts to explore the effect of ideology and idiosyncrasy on the lexical choices in translation studies. The study employed a descriptive approach to answer two research questions: Is there any relationship between ideology and idiosyncratic features of translators' lexical choices? And if yes, can it be ana...
Machine translation - a view from the lexicon
Books describing novel approaches to machine translation (MT) are always welcome. This is all the more so when the approach is one not covered by general MT surveys such as those in Hutchins and Somers (1992) or Arnold et al. (1994). Bonnie Jean Dorr's Machine Translation: A View from the Lexicon is a book with a novel approach. It describes the interlingual MT system UNITRAN rooted in two Mass...
Publication date: 1992